Method and apparatus for receiving digital wireless transmissions using multiple-antenna communication schemes

ABSTRACT

A signal detection technique for multiple-input multiple-output (MIMO) communications systems embodied in a method and apparatus for detecting a plurality of transmitted signals with use of a plurality of receiving antennas. An iterative procedure decodes one of a plurality of transmitted signals at each iteration using an intermediate matrix at each iteration to determine the transmitted signal to be decoded. The intermediate matrix for each successive iteration is advantageously computed in a recursive manner with use of a Schur complement operation performed based on the inverse of a modified version of the intermediate matrix used in the previous iteration.

FIELD OF THE INVENTION

[0001] The present invention relates generally to the field of wireless radio-frequency communication systems, and more particularly to a method and apparatus for detecting transmitted signals in a digital wireless communication system employing multiple-antenna communications.

BACKGROUND OF THE INVENTION

[0002] Multiple-antenna communications systems, also known as Multiple-Input Multiple-Output (MIMO) systems, are known to be able to achieve very high spectral efficiencies in scattering environments, with no increase in bandwidth or transmitted power. In particular, it is known that such a multipath wireless channel is capable of huge capacities, provided that the multipath scattering is sufficiently rich and is properly exploited through the use of an appropriate processing architecture and multiple antennas (both at transmission and reception).

[0003] One such MIMO system is described, for example, in U.S. Pat. No. 6,097,771, issued on Aug. 1, 2000 to G. Foschini, entitled “Wireless Communications System Having A Layered Space-Time Architecture Employing Multi-Element Antennas.” U.S. Pat. No. 6,097,771, which is commonly assigned to the assignee of the present invention, is hereby incorporated by reference as if fully set forth herein. The architecture described in U.S. Pat. No. 6,097,771 has been shown to be theoretically capable of approaching the Shannon capacity for multiple transmitters and receivers. (As is well-known to those of ordinary skill in the art, the Shannon capacity of a system refers to the information-theoretic capacity limit of the system.)

[0004] Another such MIMO system is described, for example, in U.S. Pat. No. 6,317,466, issued on Nov. 13, 2001 to G. Foschini et al., entitled “Wireless Communications System Having A Space-Time Architecture Employing Multi-Element Antennas At Both The Transmitter And Receiver” (hereinafter “Foschini et al.”). U.S. Pat. No. 6,317,466, which is also commonly assigned to the assignee of the present invention, is also hereby incorporated by reference as if fully set forth herein. The architecture described in Foschini et al. provides for a technique having a significantly lower computational complexity than that of U.S. Pat. No. 6,097,771, but which nonetheless can still achieve a substantial portion of the Shannon capacity.

[0005] Specifically, in the system of Foschini et al. a data stream is split into M uncorrelated sub-streams of symbols, each of which is transmitted by one of M transmitting antennas. The M sub-streams are picked up by N receiving antennas after having been perturbed by a channel matrix H. (The channel matrix H represents the signal interference or signal loss which naturally occurs as a result of the transmission channel.) The sub-stream signal with the highest signal-to-noise ratio is advantageously detected first and this involves the calculation of the pseudo-inverse of H or the calculation of a minimum mean-square error filter. The effect of the detected symbol as well as the effect of the corresponding transmission channel is then advantageously removed (mathematically) from the N received signals. This process repeats with the next strongest sub-stream signal among the remaining undetected signals. Thus, this approach detects M symbols (one from each of the M sub-streams) in M iterations. Moreover, it has been proven that this decoding order is optimal from a performance point of view. However, the computational complexity of the Foschini et al. technique is still reasonably high (albeit lower than that of U.S. Pat. No. 6,097,771).

[0006] This complexity problem was addressed, for example, in U.S. patent application Ser. No. 09/438,900, filed on Nov. 12, 1999 by B. Hassibi, entitled “Method And Apparatus For Receiving Wireless Transmissions Using Multiple-Antenna Arrays” (hereinafter “Hassibi”). U.S. patent application Ser. No. 09/438,900 is commonly assigned to the assignee of the present invention and is hereby incorporated by reference as if fully set forth herein. In Hassibi, it was recognized that mathematical matrix inversion operations are inherently costly (in computational complexity), and, making use of that recognition, an improved technique for detecting the M transmitted signals was disclosed. In particular, in the technique of Hassibi, as each transmitted symbol is detected the effect of the detected symbol and of the corresponding channel is advantageously subtracted from the N received signals without performing any mathematical matrix inversion operations.

[0007] Although the prior art techniques such as that of Foschini et al. and especially that of Hassibi have considerably reduced the computational complexity of the signal detection process for MIMO systems over the earlier techniques, their complexity nonetheless rises significantly as the number of antennas grows. That is, while reasonably efficient when used with a modest number of antennas, these techniques become more cumbersome, particularly when the number of transmitting antennas becomes large (e.g., greater than 10). Therefore, an improved signal detection technique for MIMO systems, whose computational complexity does not increase as quickly with increasing numbers of antennas, would be highly desirable.

SUMMARY OF THE INVENTION

[0008] In accordance with the present invention, an improved signal detection technique for MIMO systems is embodied in a method and apparatus for detecting a plurality of transmitted signals with use of a plurality of receiving antennas. In particular, and in accordance with one illustrative embodiment of the present invention, an iterative procedure decodes one of a plurality of transmitted signals at each iteration using an intermediate matrix at each iteration to determine the transmitted signal to be decoded. The intermediate matrix for each successive iteration is advantageously computed in a recursive manner with use of a Schur complement operation performed based on the inverse of a modified version of the intermediate matrix used in the previous iteration. (A Schur complement is a well-known matrix operation fully familiar to those skilled in the art.)

[0009] More specifically, a method and apparatus for detecting a plurality of transmitted signals transmitted across a channel by respective transmitting antenna elements in a multiple-input multiple-output communications system is provided. The method, for example, comprises the steps of (a) collecting a plurality of received signals from respective receiving antenna elements in said communications system; (b) determining a channel matrix H of estimated channel coefficients based on said plurality of received signals; (c) computing an estimate of a selected one of said transmitted signals, said estimate based on said plurality of received signals and on an intermediate matrix Q, thereby resulting in detection of the selected one of said transmitted signals, wherein said intermediate matrix Q is a function of the channel matrix H; and (d) repeating at least step (c) one or more times to detect an additional one or more of said transmitted signals, wherein said intermediate matrix Q as used in step (c) for each such repeated execution thereof is re-computed based on a function of an inverse of a Schur complement of an element in the inverse of a modified version of the intermediate matrix Q used in the previous execution of step (c).

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] FIG. 1 shows an illustrative MIMO system architecture in which an illustrative embodiment of the present invention may be advantageously employed.

[0011] FIG. 2 shows an illustration of how the transmission channel for a MIMO system may be represented as a channel matrix in accordance with an illustrative embodiment of the present invention.

[0012] FIG. 3 shows an illustration of how an estimation matrix may be advantageously used to decode the plurality of transmitted signals in accordance with an illustrative embodiment of the present invention.

[0013] FIG. 4 shows a flowchart of a sequential nulling and cancellation scheme for use in decoding the plurality of transmitted signals in accordance with a prior art technique.

[0014] FIG. 5 shows a flowchart of a sequential nulling and cancellation scheme for use in decoding the plurality of transmitted signals in accordance with an illustrative embodiment of the present invention.

DETAILED DESCRIPTION

[0015] An Overview of an Illustrative MIMO System Architecture

[0016]FIG. 1 shows an illustrative MIMO system architecture in which an illustrative embodiment of the present invention may be advantageously employed. As shown in the figure, a single data stream is transmitted across a wireless channel with use of a communications link comprising M transmitting antennas and N receiving antennas. The channel is advantageously presumed to be a Rayleigh channel, familiar to those of ordinary skill in the art. In particular, the channel is flat-fading (meaning that the signals are narrow-band), and it is presumed that there is no Inter-Symbol Interference (ISI) between adjacent symbols.

[0017] The figure shows transmitter 10, channel 14 and receiver 15. Specifically, demultiplexer 11 of transmitter 10 separates the data stream into M uncorrelated data sequences which are then modulated by modulators 12-1 through 12-M respectively into M substreams, s₁(k), . . . s_(M)(k), each of which is then sent through corresponding transmitting antennas 13-1 through 13-M. After traversing channel 14 (whose behavior is represented by the matrix H—see discussion of FIG. 2, below), the transmitted signals are captured by receiver 15, specifically comprising receiving antennas 16-1 through 16-N and corresponding demodulators 17-1 through 17-N, which produce intermediate signals x₁(k) through x_(N)(k), respectively. Then, in accordance with the principles of the present invention, signal detector 18 generates the recovered data stream from intermediate signals x₁(k) through x_(N)(k) in accordance, for example, with an illustrative embodiment of the present invention.

[0018] FIG. 2 shows an illustration of how the transmission channel for a MIMO system may be represented as a channel matrix in accordance with an illustrative embodiment of the present invention. In particular, the figure shows transmitting antennas 13-1 through 13-M, used to transmit signals s₁(k), s₂(k), . . . , s_(M)(k), respectively; channel matrix H representative of transmission channel 14; and the portion of receiver 15 comprising receiving antennas 16-1 through 16-N and corresponding demodulators 17-1 through 17-N, which produce intermediate signals x₁(k) through x_(N)(k), respectively. Also shown are summation elements 21-1 through 21-N which represent the (unavoidable) inclusion of noise signals w₁(k) through w_(N)(k), respectively, into the signals received by receiving antennas 16-1 through 16-N.

[0019] Specifically, channel matrix H comprises channel vectors h_(:1) through h_(:M), representing the channel's effect on each of the M transmitted signals, respectively. More specifically, channel vector h_(:1) comprises channel matrix entries h₁₁ through h_(N1), representing the channel's effect on transmitted signal s₁(k) at each of receiving antennas 16-1 through 16-N, respectively; channel vector h_(:2) comprises channel matrix entries h₁₂ through h_(N2), representing the channel's effect on transmitted signal s₂(k) at each of receiving antennas 16-1 through 16-N, respectively, . . . , and channel vector h_(:M) comprises channel matrix entries h_(1M) through h_(NM), representing the channel's effect on transmitted signal s_(M)(k) at each of receiving antennas 16-1 through 16-N, respectively.

[0020] The operation of the illustrative MIMO architecture is described more formally, below. First, at the receivers, at sample time k, we have: $x(k) = \sum_{m=1}^{M} h_{:m} s_{m}(k) + w(k) = H s(k) + w(k), \quad k = 1, 2, \ldots, K, \qquad (1)$ where $x(k) = \begin{bmatrix} x_{1}(k) & x_{2}(k) & \cdots & x_{N}(k) \end{bmatrix}^{T} = \begin{bmatrix} h_{1:}^{H} s(k) + w_{1}(k) & h_{2:}^{H} s(k) + w_{2}(k) & \cdots & h_{N:}^{H} s(k) + w_{N}(k) \end{bmatrix}^{T}$

[0021] is the N-dimensional received vector, $H = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1M} \\ h_{21} & h_{22} & \cdots & h_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N1} & h_{N2} & \cdots & h_{NM} \end{bmatrix} = \begin{bmatrix} h_{:1} & h_{:2} & \cdots & h_{:M} \end{bmatrix} = \begin{bmatrix} h_{1:}^{H} \\ h_{2:}^{H} \\ \vdots \\ h_{N:}^{H} \end{bmatrix}$

[0022] is an N×M complex matrix assumed to be constant for K symbol periods, vectors h_(n:) and h_(:m) are respectively of length M and N,

$s(k) = \begin{bmatrix} s_{1}(k) & s_{2}(k) & \cdots & s_{M}(k) \end{bmatrix}^{T}$

[0023] is the M-dimensional transmitted vector,

$w(k) = \begin{bmatrix} w_{1}(k) & w_{2}(k) & \cdots & w_{N}(k) \end{bmatrix}^{T}$

[0024] is a zero-mean complex additive white Gaussian noise (AWGN) vector with covariance $\begin{matrix} \begin{matrix} {R_{ww} = {E\left\{ {{w(k)}{w^{H}(k)}} \right\}}} \\ {{= {\sigma_{w}^{2}I_{N \times N}}},} \end{matrix} & (2) \end{matrix}$

[0025] and ^(T) and ^(H) denote respectively the transpose and the conjugate transpose of a matrix or a vector, and where I_(N×N) represents the N×N identity matrix. (It will be assumed herein that the additive noise w(k) is independent both in time and space.)

[0026] The transmitted vector s(k) has a total power P_(T). This power is advantageously held constant regardless of the number of transmitting antennas M and corresponds to the trace of the covariance matrix of the transmitted vector: $\begin{matrix} \begin{matrix} {P_{T} = {{{tr}\left\lbrack R_{ss} \right\rbrack} = {Constant}}} \\ {= {\sum\limits_{m = 1}^{M}\quad {\sigma_{s_{m}}^{2}.}}} \end{matrix} & (3) \end{matrix}$

[0027] It will be assumed herein that all of the antennas transmit with the same power,

$\sigma_{s_{1}}^{2} = \sigma_{s_{2}}^{2} = \cdots = \sigma_{s_{M}}^{2} = \sigma_{s}^{2},$

[0028] such that

$P_{T} = M \sigma_{s}^{2}. \qquad (4)$

[0029] Now define a parameter ρ that relates P_(T) and σ_(w) ² as follows: $\begin{matrix} {\rho = \frac{P_{T}}{\sigma_{w}^{2}}} & (5) \end{matrix}$

[0030] This parameter advantageously corresponds to the average receive signal-to-noise ratio (SNR) per antenna.

[0031] Therefore, in accordance with the principles of the present invention, an original information sequence for wireless transmission is demultiplexed into M data sequences s_(m)(k), m=1, . . . , M (called substreams), and each one of them is sent through a transmitting antenna. These M substreams are assumed to be uncorrelated, which implies that the covariance matrix of the transmitted vector s(k) is diagonal: $\begin{matrix} \begin{matrix} {R_{ss} = {E\left\{ {{s(k)}{s^{H}(k)}} \right\}}} \\ {= {\sigma_{s}^{2}I_{M \times M}}} \end{matrix} & (6) \end{matrix}$

[0032] Also assume that N≧M and that H has full column rank, i.e., that rank[H]=M.
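The signal model of Equations (1) through (6) can be illustrated with a short NumPy sketch. The function name simulate_mimo, the QPSK alphabet, and the normalization P_T = 1 are assumptions made only for this illustration; they are not taken from the specification.

```python
import numpy as np

def simulate_mimo(M, N, K, rho, rng):
    """Draw one burst x(k) = H s(k) + w(k), k = 1..K, as in Equation (1)."""
    sigma_s2 = 1.0 / M    # equal power per antenna, with P_T = 1 (assumed)
    sigma_w2 = 1.0 / rho  # from rho = P_T / sigma_w^2 with P_T = 1
    # Rayleigh flat-fading channel: i.i.d. unit-variance complex Gaussian entries
    H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    # Uncorrelated QPSK substreams (assumed constellation), variance sigma_s2 each
    bits = rng.integers(0, 2, size=(2, M, K))
    s = np.sqrt(sigma_s2 / 2) * ((2 * bits[0] - 1) + 1j * (2 * bits[1] - 1))
    # Zero-mean complex AWGN with covariance sigma_w2 * I
    w = np.sqrt(sigma_w2 / 2) * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K)))
    x = H @ s + w         # column k of x is the received vector x(k)
    return H, s, w, x
```

Each column of the returned array x corresponds to one sample time k, and H is held fixed over the K columns, matching the assumption that the channel is constant for K symbol periods.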

[0033] Assume that the transmitter has no knowledge of the channel. In this case, the Shannon capacity of the (M, N) flat-faded channel is given by the following well-known formula, familiar to those skilled in the art: $\begin{matrix} \begin{matrix} {C = {{\log_{2}\left\lbrack {\det \left( {I_{N \times N} + {\frac{\rho}{M}{HH}^{H}}} \right)} \right\rbrack}\left\lbrack {{bps}\text{/}{Hz}} \right\rbrack}} \\ {= {{\log_{2}\left\lbrack {\det \left( {I_{M \times M} + {\frac{\rho}{M}H^{H}H}} \right)} \right\rbrack}.}} \end{matrix} & (7) \end{matrix}$

[0034] One important observation that can be made from Equation (7) is that, for rich scattering channels (meaning that the elements of the channel matrix are independent of one another), the MIMO channel capacity grows roughly proportionally to M.
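As a hedged numerical illustration of Equation (7), the sketch below evaluates the M×M form of the capacity expression for a given channel matrix. The function name mimo_capacity is hypothetical, and the use of a log-determinant routine is an implementation choice made here for numerical robustness.

```python
import numpy as np

def mimo_capacity(H, rho):
    """Capacity in bps/Hz from the M x M form of Equation (7)."""
    N, M = H.shape
    A = np.eye(M) + (rho / M) * (H.conj().T @ H)
    # A is Hermitian positive definite, so log|det A| equals log det A
    _, logabsdet = np.linalg.slogdet(A)
    return logabsdet / np.log(2)
```

For example, with H equal to the 2×2 identity and ρ = 2, the matrix inside the determinant is 2I, whose determinant is 4, so the function returns 2 bps/Hz. The N×N form of Equation (7) gives the same value for any H.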

[0035] An Illustrative MIMO Signal Detection Technique

[0036] In a MIMO system such as the illustrative system shown in FIG. 1, the detection of the transmitted symbols at the receivers typically comprises first determining the value of the complex channel matrix H, or, more precisely, an estimate thereof. Most typically, and as is well known to those skilled in the art, H is determined by having the transmitter send a training sequence (comprising a sequence of symbols which is known in advance by the receiver) at the beginning of each burst. (As is also well known, most communications systems separate the sequence of symbols to be transmitted into individual portions referred to as bursts.) The length of a burst is advantageously equal to K=K₁+K₂ symbols where the K₁ symbols are used for training and the K₂ symbols are used for the transmission of the actual data information. The propagation coefficients (i.e., the values of the elements of the matrix H) may then be assumed to be constant during an entire burst (since it occurs over a reasonably short period of time), after which they may change to new independent random values which it is assumed they maintain for another K symbols, and so on. Note that no distinction will be made between H and its estimate herein.

[0037] Thus, given the channel matrix H and the set of received signals x(k) as described above, the goal of a MIMO system is to determine (or, more precisely, to estimate) the transmitted signals s(k). Note in particular that the transmitted signals s(k) have been coded (illustratively by modulators 12-1 through 12-M of FIG. 1) with use of a predetermined symbol constellation, and therefore, the estimates of the transmitted signals (which will be referred to herein as ŝ(k)=[ŝ₁ ŝ₂ . . . ŝ_(M)]^(T)) should fall into that constellation.

[0038] As is well known to those skilled in the art, one general approach to the problem of signal detection in MIMO systems comprises a Sequential Nulling and Cancellation (SNC) technique, such as is described, for example, in Foschini et al. An SNC technique typically consists of performing the following steps:

[0039] 1. The M transmitted signals are first estimated by filtering the N received signals using, for example, either the well-known minimum mean-square-error (MMSE) filtering technique or the well-known zero forcing (ZF) technique. In either case, an estimation matrix (which we will identify herein as G) is the mathematical result.

[0040] 2. Among the M estimated source signals, the substream with the smallest estimation variance or the strongest SNR is chosen for detection.

[0041] 3. The interference contributed by the detected substream is cancelled (removed or “subtracted out”) from the N received signals.

[0042] 4. Return to step 1 and repeat with the number of substreams advantageously reduced by one, until all substreams have been detected.

[0043] Thus, it can be seen that with M iterations, each of the substreams is advantageously detected.

[0044]FIG. 3 shows an illustration of how an estimation matrix may be advantageously used to decode the plurality of transmitted signals in accordance with an illustrative embodiment of the present invention. Illustratively, the figure shows a MIMO environment with three transmitting antennas and four receiving antennas. (That is, in accordance with the description provided herein, M=3 and N=4.) Note that the use of an estimation matrix as shown in FIG. 3 is a characteristic of both certain prior art MIMO signal detection schemes and certain MIMO signal detection schemes in accordance with various illustrative embodiments of the present invention.

[0045] As shown in the figure, estimation matrix 31 (G), which may be advantageously determined with use of either the MMSE technique or the ZF technique, is used to compute the estimates of the transmitted signals. In particular, the signal vector y(k) is computed as y(k)=G^(H)x(k) at each iteration of the SNC process. Then, as each substream y_(i)(k), for some i (which represents the substream with the smallest estimation variance or the strongest SNR), is detected in turn, the corresponding estimate of the transmitted signal may be advantageously determined by computing ŝ_(i)(k)=Q[y_(i)(k)], where Q[·] represents the quantization procedure according to the constellation being used. As further shown in the figure, substream 2 is illustratively detected first, followed by substream 1 and finally by substream 3.
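The quantization procedure Q[·] can be sketched as a nearest-point decision. The specification does not fix a particular constellation, so the unit-energy QPSK alphabet and the function name slice_to_constellation below are assumptions made for illustration only.

```python
import numpy as np

# Unit-energy QPSK alphabet (an assumed example constellation)
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def slice_to_constellation(y, constellation):
    """Q[.]: map each soft estimate to the nearest constellation point."""
    y = np.atleast_1d(y)
    # Distance from every estimate to every constellation point
    dist = np.abs(y[:, None] - constellation[None, :])
    return constellation[np.argmin(dist, axis=1)]
```

A soft estimate such as 0.9+0.2j would thus be mapped to the QPSK point in the first quadrant.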

[0046] A Prior Art MIMO Signal Decoding Technique

[0047]FIG. 4 shows a high-level flowchart of a sequential nulling and cancellation scheme for use in decoding the plurality of transmitted signals in accordance with a prior art technique. In this prior art technique, the estimation matrix (e.g., estimation matrix 31, G, as shown in FIG. 3) is computed with use of either the pseudo-inverse of the channel matrix H or, preferably, the MMSE filter G. In particular, define an error vector signal at time k between the input s(k) and its estimate: $\begin{matrix} \begin{matrix} {{e(k)} = {{s(k)} - {y(k)}}} \\ {= {{s(k)} - {G^{H}{{x(k)}.}}}} \end{matrix} & (8) \end{matrix}$

[0048] Now, define the error criterion: $\begin{matrix} \begin{matrix} {J = {E\left\{ {{e^{H}(k)}{e(k)}} \right\}}} \\ {= {{{tr}\left\lbrack {E\left\{ {{e(k)}{e^{H}(k)}} \right\}} \right\rbrack}.}} \end{matrix} & (9) \end{matrix}$

[0049] The minimization of Equation (9) leads to the Wiener-Hopf equation, familiar to those skilled in the art—namely:

G ^(H) R _(xx) =R _(sx),  (10)

[0050] where

R _(xx) =E{x(k)x ^(H)(k)}  (11)

[0051] is the output signal covariance matrix, and

R _(sx) =E{s(k)x ^(H)(k)}  (12)

[0052] is the cross-correlation matrix between the input and output signals.

[0053] From Equation (10), it can be seen that the MMSE filter is:

G=[HH ^(H) +αI _(N×N)]⁻¹ H,  (13)

[0054] where $\begin{matrix} {\alpha = {\frac{\sigma_{w}^{2}}{\sigma_{s}^{2}}.}} & (14) \end{matrix}$
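The step from Equation (10) to Equation (13) can be sketched as follows, under the assumption (not stated explicitly above) that the transmitted vector s(k) and the noise vector w(k) are uncorrelated. Substituting Equation (1) into Equations (11) and (12) and using Equations (2) and (6) gives

$R_{xx} = H R_{ss} H^{H} + R_{ww} = \sigma_{s}^{2} H H^{H} + \sigma_{w}^{2} I_{N \times N}, \qquad R_{sx} = R_{ss} H^{H} = \sigma_{s}^{2} H^{H},$

so that Equation (10) yields

$G^{H} = R_{sx} R_{xx}^{-1} = H^{H} \left[ H H^{H} + \alpha I_{N \times N} \right]^{-1},$

whose conjugate transpose is Equation (13), since the bracketed matrix is Hermitian.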

[0055] It can easily be seen that Equation (13) is equivalent to: $\begin{matrix} \begin{matrix} {G = {H\left\lbrack {{H^{H}H} + {\alpha \quad I_{M \times M}}} \right\rbrack}^{- 1}} \\ {= {{HQ}.}} \end{matrix} & (15) \end{matrix}$

[0056] The second form—that is, Equation (15)—is more useful and more efficient in practice since M≦N and the size of the matrix to be inverted in Equation (15) is either smaller than or equal in size to the matrix to be inverted in Equation (13).
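The equivalence of Equations (13) and (15) can be verified in one step. Starting from the identity

$\left[ H H^{H} + \alpha I_{N \times N} \right] H = H H^{H} H + \alpha H = H \left[ H^{H} H + \alpha I_{M \times M} \right],$

pre-multiplying by the inverse of the N×N bracket and post-multiplying by the inverse of the M×M bracket (both inverses exist for α>0) gives

$H \left[ H^{H} H + \alpha I_{M \times M} \right]^{-1} = \left[ H H^{H} + \alpha I_{N \times N} \right]^{-1} H.$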

[0057] Instead of the MMSE filter, a prior art SNC technique can alternatively directly use the pseudo-inverse (familiar to those skilled in the art) of H, which is:

G _(PI) ^(H) =[H ^(H) H] ⁻¹ H ^(H)  (16)

[0058] As can easily be seen, the only difference between the matrices G and G_(PI) is that G is “regularized” by a diagonal matrix αI_(M×M) while G_(PI) is not. This regularization introduces a bias, but Equation (15) actually gives a much more reliable result than Equation (16) when the matrix H^(H)H is ill-conditioned and the estimation of the channel is noisy. In practice, depending on the condition number of the matrix H^(H)H, a different value may be used for α than the one given in Equation (14). For example, if this condition number is very high and the SNR is also high, it will be better to take a higher value for α. Thus, the MMSE filter can be seen as a biased pseudo-inverse of H.

[0059] More specifically, in the prior art SNC algorithm being described, the detection of the symbols s_(m)(k) is performed over M iterations. Note that the order in which the components of s(k) are detected is important to the overall performance of the system. Let the ordered set

S={p₁, p₂, . . . , p_(M)}  (17)

[0060] be a permutation of the integers 1, 2, . . . , M specifying the order in which components of the transmitted symbol vector s(k) are extracted.

[0061] Thus, returning to FIG. 4, the illustrative prior art SNC technique comprises the following steps.

[0062] Initialization Step (as Shown in Block 40 of the Figure):

[0063] Use the training sequence to determine the matrix H and set the initial matrix H_(M)=H for the first iteration. Also, determine the received signals x(k) and set x₁(k)=x(k) for the first iteration.

[0064] Step 1 (as Shown in Block 41 of the Figure):

[0065] Using the MMSE filter or the pseudo-inverse, compute:

y(k)=G ^(H) x(k).  (18)

[0066] (Note that y(k) represents the estimates of the transmitted signals.) In particular, if using the MMSE filter approach, matrix G is computed by Equation (13) above; if using the zero-forcing approach, matrix G is computed by Equation (16) above. In addition, x(k) and H are determined in the initialization sequence for the first iteration, and as described below (with reference to block 44 of the figure) for all subsequent iterations.

[0067] Step 2 (as Shown in Block 42 of the Figure):

[0068] The element of y(k) with the highest SNR is detected. This element is associated with the smallest diagonal entry of Q for the MMSE filter (as will be more clearly explained below), or the column of G having the smallest norm for the pseudo-inverse (in the case where zero-forcing has been used). For example, if such a column determined in the m'th iteration is p_(m), then the estimate of the chosen transmitted signal is given by:

$\hat{s}_{p_{m}}(k) = Q\left[ y_{p_{m}}(k) \right], \qquad (19)$

[0069] with Q[·] indicating the slicing or quantization procedure in accordance with the given symbol constellation in use.

[0070] Step 3 (as Shown in Block 43 of the Figure):

[0071] Assuming that ŝ_(p_m)(k)=s_(p_m)(k), then s_(p_m)(k) is cancelled from the received vector x(k), resulting in a modified received vector, namely: $\begin{aligned} x_{2}(k) &= x(k) - s_{p_{1}}(k) h_{:p_{1}} \\ &= \sum_{m \neq p_{1}} h_{:m} s_{m}(k) + w(k) \\ &= H_{M-1} s_{M-1}(k) + w(k), \end{aligned} \qquad (20)$

[0072] where H_(M−1) is an N×(M−1) matrix derived from H by removing its p₁'th column and s_(M−1)(k) is a vector of length M−1 obtained from s(k) by removing its p₁'th component.

[0073] Step 4 (as Shown in Block 44 of the Figure):

[0074] Unless all M transmitted signals have already been decoded, steps 1-3 are repeated for components p₂, . . . , p_(M) by operating in turn on the progression of modified received vectors x₂(k), . . . , x_(M)(k). Note that at the m'th iteration, the N×(M−m) matrix H_(M−m) may be derived from H by removing m of its columns, namely, columns p₁, . . . , p_(m). It is well known that this ordering (i.e., choosing the transmitted signal having the highest SNR at each iteration in the detection process) is optimal among all possible orderings.

[0075] The following more formally summarizes the illustrative prior art SNC algorithm (using the MMSE filter):

[0076] Initialization and First Iteration:

[0077] $x_{1}(k) = x(k)$, $H_{M} = H = \begin{bmatrix} h_{M,:1} & h_{M,:2} & \cdots & h_{M,:M} \end{bmatrix}$

[0078] $Q_{M} = \begin{bmatrix} q_{M,:1} & q_{M,:2} & \cdots & q_{M,:M} \end{bmatrix} = \left[ H_{M}^{H} H_{M} + \alpha I_{M \times M} \right]^{-1}$

[0079] $f(k) = \begin{bmatrix} 1 & 2 & \cdots & M \end{bmatrix}^{T}$

[0080] $l_{1} = \arg\min_{i} q_{M,ii}$, $p_{1} = f_{l_{1}}(k)$, $y_{p_{1}}(k) = q_{M,:l_{1}}^{H} H_{M}^{H} x_{1}(k)$

[0081] Move the l₁'th entry of vector f(k) to the end

[0082] $\hat{s}_{p_{1}}(k) = Q\left[ y_{p_{1}}(k) \right]$

[0083] Recursion (i.e., Subsequent Iterations), for m=1, 2, . . . , M−1:

[0084] (a) $x_{m+1}(k) = x_{m}(k) - \hat{s}_{p_{m}}(k) h_{M-m+1,:l_{m}}$

[0085] (b) Determine H_(M−m) by removing the l_(m)'th column of H_(M−m+1)

(c) $Q_{M-m} = \left[ H_{M-m}^{H} H_{M-m} + \alpha I_{(M-m) \times (M-m)} \right]^{-1}$

[0086] (d) $l_{m+1} = \arg\min_{i} q_{M-m,ii}$, $p_{m+1} = f_{l_{m+1}}(k)$

(e) $y_{p_{m+1}}(k) = q_{M-m,:l_{m+1}}^{H} H_{M-m}^{H} x_{m+1}(k)$

[0087] (f) Move the l_(m+1)'th entry of vector f(k) to the position behind the (M−m)'th entry

[0088] (g) $\hat{s}_{p_{m+1}}(k) = Q\left[ y_{p_{m+1}}(k) \right]$

[0089] Solutions:

[0090] The estimates of the transmitted signals: $\begin{bmatrix} \hat{s}_{p_{1}}(k) & \hat{s}_{p_{2}}(k) & \cdots & \hat{s}_{p_{M}}(k) \end{bmatrix}^{T}$

[0091] The decoding order: $f(k) = \begin{bmatrix} p_{M} & p_{M-1} & \cdots & p_{1} \end{bmatrix}^{T}$
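The prior art procedure summarized above can be sketched in NumPy as follows. This is an illustrative sketch rather than a definitive implementation: the function name snc_mmse, the 0-based indexing, the list of remaining indices (used here in place of the vector f(k)), and the nearest-point slicer are all assumptions made for this example.

```python
import numpy as np

def snc_mmse(H, x, alpha, constellation):
    """Ordered SNC that re-inverts the deflated matrix at every iteration."""
    N, M = H.shape
    remaining = list(range(M))       # original indices of undetected substreams
    Hm = H.astype(complex)
    xm = x.astype(complex)
    s_hat = np.zeros(M, dtype=complex)
    order = []
    for _ in range(M):
        m = Hm.shape[1]
        # Direct inversion at every iteration: the costly step of this approach
        Q = np.linalg.inv(Hm.conj().T @ Hm + alpha * np.eye(m))
        l = int(np.argmin(np.real(np.diag(Q))))   # smallest error variance
        y = Q[l, :] @ (Hm.conj().T @ xm)          # l'th entry of G^H x with G = H Q
        p = remaining.pop(l)
        s_hat[p] = constellation[np.argmin(np.abs(y - constellation))]
        order.append(p)
        xm = xm - s_hat[p] * Hm[:, l]             # cancel the detected substream
        Hm = np.delete(Hm, l, axis=1)             # remove its column from H
    return s_hat, order
```

The returned list holds the detection order p₁, p₂, . . . , p_(M) as 0-based indices. Each pass through the loop performs a fresh matrix inversion, which is the cost that the technique of the present invention is intended to avoid.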

[0092] An Illustrative MIMO Decoding Technique According to the Present Invention

[0093]FIG. 5 shows a flow chart of a sequential nulling and cancellation scheme for use in decoding the plurality of transmitted signals in accordance with an illustrative embodiment of the present invention. In accordance with the illustrative embodiment of the present invention, the matrix G is advantageously computed indirectly (rather than directly). Specifically, recall that:

G=HR ⁻¹  (21)

[0094] where

R=H ^(H) H+αI _(M×M)  (22)

[0095] The covariance matrix of the error signal, e(k)=s(k)−y(k), is: $\begin{aligned} R_{ee} &= E\left\{ e(k) e^{H}(k) \right\} \\ &= \sigma_{w}^{2} R^{-1} \\ &= \sigma_{w}^{2} Q. \end{aligned} \qquad (23)$

[0096] Clearly, the element of y(k) with the highest SNR is the one with the smallest error variance, so that: $\begin{matrix} {p_{1} = {\arg \quad {\min\limits_{m}\quad {q_{mm}.}}}} & (24) \end{matrix}$

[0097] where q_(mm) are the diagonal elements of the matrix Q=R⁻¹.
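The result in Equation (23) can be checked with a short derivation, again assuming that s(k) and w(k) are uncorrelated. Using G=HQ and QR=I with R=H^(H)H+αI_(M×M),

$e(k) = s(k) - Q H^{H} \left[ H s(k) + w(k) \right] = \left[ I_{M \times M} - Q H^{H} H \right] s(k) - Q H^{H} w(k) = \alpha Q s(k) - Q H^{H} w(k).$

Taking the covariance and using $\alpha \sigma_{s}^{2} = \sigma_{w}^{2}$ together with $H^{H} H = R - \alpha I_{M \times M}$ gives

$E\left\{ e(k) e^{H}(k) \right\} = \alpha^{2} \sigma_{s}^{2} Q Q + \sigma_{w}^{2} Q H^{H} H Q = \sigma_{w}^{2} \left[ \alpha Q Q + Q R Q - \alpha Q Q \right] = \sigma_{w}^{2} Q.$

The error variance of the m'th substream is therefore proportional to the m'th diagonal element of Q, which is why the smallest diagonal element identifies the substream to detect first.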

[0098] The matrix R can be rewritten as follows: $R = \sum_{n=1}^{N} h_{n:} h_{n:}^{H} + \alpha I_{M \times M}, \qquad (25)$

[0099] which means that R can be advantageously computed recursively in N iterations, as follows: $\begin{aligned} R_{[l]} &= \sum_{n=1}^{l} h_{n:} h_{n:}^{H} + \alpha I_{M \times M} \\ &= R_{[l-1]} + h_{l:} h_{l:}^{H} \end{aligned} \qquad (26)$

[0100] and

R _([N]) =R, R _([0]) =αI _(M×M)  (27)

[0101] Using the Sherman-Morrison formula (a well-known mathematical transformation fully familiar to those skilled in the art and also known as the “second lemma inversion”), Q can also be computed recursively, as follows: $\begin{matrix} {Q_{\lbrack l\rbrack} = {Q_{\lbrack{l - 1}\rbrack} - \frac{Q_{\lbrack{l - 1}\rbrack}h_{l:}h_{l:}^{H}Q_{\lbrack{l - 1}\rbrack}}{1 + {h_{l:}^{H}Q_{\lbrack{l - 1}\rbrack}h_{l:}}}}} & (28) \end{matrix}$

[0102] With the initialization ${Q_{\lbrack 0\rbrack} = {\frac{1}{\alpha}I_{M \times M}}},$

[0103] we obtain

Q _([N]) =[H ^(H) H+αI _(M×M)]⁻¹.
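The recursion of Equation (28) with the initialization above can be sketched as follows. The function name q_recursive is hypothetical; the code relies on the fact that the n'th row of H is h_(n:)^(H), so that h_(n:) is the conjugate of that row written as a column.

```python
import numpy as np

def q_recursive(H, alpha):
    """Build Q = [H^H H + alpha I]^{-1} via the rank-one updates of Equation (28)."""
    N, M = H.shape
    Q = np.eye(M, dtype=complex) / alpha       # Q_[0] = (1/alpha) I
    for l in range(N):
        h = H[l, :].conj().reshape(M, 1)       # h_{l:} as a column vector
        Qh = Q @ h
        # Q is Hermitian, so Q h h^H Q = (Q h)(Q h)^H; the denominator is 1 + h^H Q h
        Q = Q - (Qh @ Qh.conj().T) / (1.0 + (h.conj().T @ Qh).item())
    return Q
```

Each update involves only matrix-vector products, so no explicit matrix inversion is performed at any point in the N iterations.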

[0104] Note that if the process begins at iteration M+1 with the initialization $Q_{[M]} = \left[ \sum_{n=1}^{M} h_{n:} h_{n:}^{H} \right]^{-1},$

[0105] we obtain

Q _([N]) =[H ^(H) H]⁻¹.

[0106] Note that it is well known that the computation of any recursion introduces potential numerical instabilities because of the finite precision of processor units. This instability however occurs only after a very large number of iterations. Since, in this case, the number of iterations to compute Q is limited by the number of receiving antennas N, such numerical instabilities are unlikely to occur. In any event, the numerical stability can be advantageously improved by increasing the value of α at the time of initialization.

[0107] Note also that in accordance with the illustrative embodiment of the present invention as described herein, Equation (28) is advantageously computed only one time at the first iteration. Once Q_([N]) is computed, p₁ may be easily determined based on Equation (24) above.

[0108] Now, continuing the illustrative process for the first iteration, the input estimate may be computed as follows: $y_{p_{1}}(k) = \sum_{m=1}^{M} q_{p_{1},m} h_{:m}^{H} x(k) \qquad (29)$

[0109] and

$\hat{s}_{p_{1}}(k) = Q\left[ y_{p_{1}}(k) \right]. \qquad (30)$

[0110] Note that the last step of the illustrative decoding procedure in accordance with the present invention is the same as the last step (“Step 3”) of the prior art approach described above.
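The first-iteration computation of Equations (29) and (30) may be sketched as follows. This is a minimal illustration under stated assumptions: the quantizer Q[·] is taken to be a nearest-point slicer for a unit-energy QPSK constellation, which is not specified in the description above, and the names `qpsk_slice` and `first_estimate` are hypothetical. The index p₁ is selected as the smallest diagonal entry of the matrix Q.

```python
import numpy as np

def qpsk_slice(y):
    """Assumed quantizer Q[.]: nearest unit-energy QPSK point (illustrative only)."""
    return (np.sign(y.real) + 1j * np.sign(y.imag)) / np.sqrt(2)

def first_estimate(H, Q, x):
    """Select p1, form y_{p1}(k) per Equation (29), and slice per Equation (30)."""
    p1 = int(np.argmin(np.real(np.diag(Q))))  # smallest diagonal entry of Q
    y_p1 = Q[p1, :] @ (H.conj().T @ x)        # sum over m of q_{p1,m} h_{:m}^H x(k)
    return p1, y_p1, qpsk_slice(y_p1)
```

Indices in this sketch are zero-based, whereas the description counts from 1.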

[0111] For each of the following iterations (after the first), the process is as follows. First, note that the matrix Q can advantageously be deflated recursively. Specifically, note that: $\begin{matrix} {Q_{\lbrack N\rbrack} = Q = \left\lbrack {{H^{H}H} + {\alpha \quad I_{M \times M}}} \right\rbrack^{- 1} = R^{- 1} = \begin{bmatrix} {{h_{:1}^{H}h_{:1}} + \alpha} & {h_{:1}^{H}h_{:2}} & \cdots & {h_{:1}^{H}h_{:M}} \\ {h_{:2}^{H}h_{:1}} & {{h_{:2}^{H}h_{:2}} + \alpha} & \cdots & {h_{:2}^{H}h_{:M}} \\ \vdots & \vdots & \ddots & \vdots \\ {h_{:M}^{H}h_{:1}} & {h_{:M}^{H}h_{:2}} & \cdots & {{h_{:M}^{H}h_{:M}} + \alpha} \end{bmatrix}^{- 1}.} & (31) \end{matrix}$

[0112] After the value of p₁ which corresponds to the element y_(p) ₁ (k) with the smallest variance is determined, we can advantageously interchange the p₁'th and the M'th entries of the transmitted signal s(k) such that the M'th signal becomes the current best estimate. Of course, in accordance with the illustrative embodiment of the present invention, the indices of the transmitted signals are advantageously tracked after such a reordering. Accordingly, the p₁'th and the M'th columns of the channel matrix H are advantageously interchanged. This can, for example, be easily achieved by post-multiplying H with a permutation matrix P_(p) ₁ _(M) which is given by: $P_{p_{1}M} = {\begin{bmatrix} 1 & 0 & \cdots & & & & \cdots & 0 \\ 0 & \ddots & & & & & & \vdots \\ \vdots & & 1 & & & & & \\ & & & 0 & \cdots & & & 1 \\ & & & \vdots & 1 & & & 0 \\ & & & & & \ddots & & \vdots \\ & & & & & & 1 & 0 \\ 0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 0 \\ & & & \uparrow & & & & \uparrow \\ & & & p_{1} & & & & M \end{bmatrix}_{M \times M}.}$

[0113] Since

(HP _(p) ₁ _(M))^(H)(HP _(p) ₁ _(M))+αI _(M×M) =P _(p) ₁ _(M)(H ^(H) H+αI _(M×M))P _(p) ₁ _(M),  (32)

[0114] it follows that the rows and columns p₁ and M of the matrix R may be advantageously permuted. Equivalently, in accordance with an alternative illustrative embodiment of the present invention, the rows and columns p₁ and M of the matrix Q may be permuted. This can be easily seen from the fact that

(P _(p) ₁ _(M) RP _(p) ₁ _(M))⁻¹ =P _(p) ₁ _(M) R ⁻¹ P _(p) ₁ _(M) =P _(p) ₁ _(M) QP _(p) ₁ _(M).  (33)
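In practice, the products in Equations (32) and (33) need not be formed explicitly, since pre- and post-multiplying by the permutation matrix amounts to swapping two rows and the two corresponding columns. The following sketch illustrates this; the helper name `swap_rows_cols` is hypothetical and indices are zero-based.

```python
import numpy as np

def swap_rows_cols(A, i, j):
    """Return P A P, where P is the permutation matrix that swaps indices i and j.

    P is symmetric and is its own inverse, so the same swap applied to
    R = Q^{-1} and to Q keeps the two matrices inverse to each other.
    """
    B = A.copy()
    B[[i, j], :] = B[[j, i], :]   # swap rows i and j
    B[:, [i, j]] = B[:, [j, i]]   # swap columns i and j
    return B
```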

[0115] Note that these permutations advantageously allow for the removal of the effect of the channel h_(:p) ₁ quite easily. Specifically: $\begin{matrix} {Q_{M} = \begin{bmatrix} {{h_{:1}^{H}h_{:1}} + \alpha} & {h_{:1}^{H}h_{:2}} & \cdots & {h_{:1}^{H}h_{:p_{1}}} \\ {h_{:2}^{H}h_{:1}} & {{h_{:2}^{H}h_{:2}} + \alpha} & \cdots & {h_{:2}^{H}h_{:p_{1}}} \\ \vdots & \vdots & \ddots & \vdots \\ {h_{:p_{1}}^{H}h_{:1}} & {h_{:p_{1}}^{H}h_{:2}} & \cdots & {{h_{:p_{1}}^{H}h_{:p_{1}}} + \alpha} \end{bmatrix}^{- 1} = \begin{bmatrix} R_{M - 1} & v_{M - 1} \\ v_{M - 1}^{H} & \beta_{p_{1}} \end{bmatrix}^{- 1},} & (34) \end{matrix}$ where ${\beta_{p_{1}} = {{h_{:p_{1}}^{H}h_{:p_{1}}} + \alpha}},$ ${v_{M - 1} = \begin{bmatrix} {h_{:1}^{H}h_{:p_{1}}} & {h_{:2}^{H}h_{:p_{1}}} & \cdots & {h_{:{M - 1}}^{H}h_{:p_{1}}} \end{bmatrix}^{T}},$ and ${R_{M - 1} = {{H_{M - 1}^{H}H_{M - 1}} + {\alpha \quad I_{{({M - 1})} \times {({M - 1})}}}}}.$

[0116] It can easily be shown that: $\begin{matrix} {\quad {{Q_{M} = \begin{bmatrix} T_{M - 1}^{- 1} & {{- T_{M - 1}^{- 1}}{v_{M - 1}/\beta_{p_{1}}}} \\ {{- v_{M - 1}^{H}}{T_{M - 1}^{- 1}/\beta_{p_{1}}}} & {{1/\beta_{p_{1}}} + {v_{M - 1}^{H}T_{M - 1}^{- 1}{v_{M - 1}/\beta_{p_{1}}^{2}}}} \end{bmatrix}},{where}}} & (35) \\ {T_{M - 1} = {R_{M - 1} - {v_{M - 1}{v_{M - 1}^{H}/\beta_{p_{1}}}}}} & (36) \end{matrix}$

[0117] is the Schur complement of β_(p) ₁ in Q_(M) ⁻¹ . As is well known to those skilled in the art, when a matrix A is partitioned into the form ${A = \begin{bmatrix} E & F \\ G & H \end{bmatrix}},$

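[0118] then the Schur complement, S, of (partition) E in (matrix) A is S=H−GE⁻¹F. Correspondingly, the Schur complement of (partition) H in A is E−FH⁻¹G, which is the form taken by T_(M−1) in Equation (36), with the scalar β_(p) ₁ occupying the lower-right partition.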
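As a brief check of Equation (35), offered as a sketch and assuming that β_(p) ₁ is nonzero and that T_(M−1) is invertible, the partitioned matrix of Equation (34) may be multiplied by the partitioned form of Equation (35). Using Equation (36), the four blocks of the product reduce to those of the identity matrix:

```latex
\begin{aligned}
R_{M-1}T_{M-1}^{-1} - v_{M-1}v_{M-1}^{H}T_{M-1}^{-1}/\beta_{p_1}
  &= \left(R_{M-1} - v_{M-1}v_{M-1}^{H}/\beta_{p_1}\right)T_{M-1}^{-1} = I, \\
-R_{M-1}T_{M-1}^{-1}v_{M-1}/\beta_{p_1} + v_{M-1}/\beta_{p_1}
  + v_{M-1}v_{M-1}^{H}T_{M-1}^{-1}v_{M-1}/\beta_{p_1}^{2}
  &= -T_{M-1}T_{M-1}^{-1}v_{M-1}/\beta_{p_1} + v_{M-1}/\beta_{p_1} = 0, \\
v_{M-1}^{H}T_{M-1}^{-1} - \beta_{p_1}v_{M-1}^{H}T_{M-1}^{-1}/\beta_{p_1} &= 0, \\
-v_{M-1}^{H}T_{M-1}^{-1}v_{M-1}/\beta_{p_1} + 1
  + v_{M-1}^{H}T_{M-1}^{-1}v_{M-1}/\beta_{p_1} &= 1.
\end{aligned}
```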

[0119] Furthermore, from Equation (36), it may be deduced that: $\begin{matrix} {R_{M - 1}^{- 1} = {Q_{M - 1} = \left\lbrack {T_{M - 1} + {v_{M - 1}{v_{M - 1}^{H}/\beta_{p_{1}}}}} \right\rbrack^{- 1}}} & (37) \end{matrix}$

[0120] and using the (well-known) Sherman-Morrison formula, we obtain $\begin{matrix} {Q_{M - 1} = {T_{M - 1}^{- 1} - {\frac{T_{M - 1}^{- 1}v_{M - 1}v_{M - 1}^{H}T_{M - 1}^{- 1}}{\beta_{p_{1}} + {v_{M - 1}^{H}T_{M - 1}^{- 1}v_{M - 1}}}.}}} & (38) \end{matrix}$

[0121] Clearly, Equation (38) shows that the matrix Q can be advantageously deflated recursively. Specifically, in the general case: $\begin{matrix} {Q_{M - m} = {T_{M - m}^{- 1} - \frac{T_{M - m}^{- 1}v_{M - m}v_{M - m}^{H}T_{M - m}^{- 1}}{\beta_{p_{m}} + {v_{M - m}^{H}T_{M - m}^{- 1}v_{M - m}}}}} & (39) \\ {R_{M - m} = {{H_{M - m}^{H}H_{M - m}} + {\alpha \quad I_{{({M - m})} \times {({M - m})}}}}} & (40) \end{matrix}$

[0122] Note that R_(M−m) may advantageously be easily determined without direct computation. In particular, and in accordance with the illustrative embodiment of the present invention, R_(M−m) is determined from R_(M+1−m) by removing the last row and column thereof; only R_(M)=R is calculated directly, during the first iteration of the illustrative procedure.
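A single deflation step, as described by Equations (34) through (38), may be sketched as follows. The function name `deflate` is hypothetical, and the sketch assumes that the rows and columns of both input matrices have already been permuted so that the detected signal occupies the last position.

```python
import numpy as np

def deflate(R_full, Q_full):
    """From permuted (R_M, Q_M), return (R_{M-1}, Q_{M-1}) without inverting."""
    R_small = R_full[:-1, :-1]               # R_{M-1}: drop last row and column
    v = R_full[:-1, -1:]                     # v_{M-1}, kept as a column vector
    beta = R_full[-1, -1]                    # beta_{p_1}
    T_inv = Q_full[:-1, :-1]                 # T_{M-1}^{-1}, per Equation (35)
    Tv = T_inv @ v                           # T^{-1} v
    vT = v.conj().T @ T_inv                  # v^H T^{-1}
    denom = beta + (v.conj().T @ Tv).item()  # scalar denominator of (38)
    Q_small = T_inv - (Tv @ vT) / denom      # Equation (38)
    return R_small, Q_small
```

The returned `R_small` is a view into `R_full`; a caller that modifies it in place and still needs the original would copy it first.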

[0123] Thus, returning to FIG. 5, the illustrative MIMO decoding technique in accordance with an illustrative embodiment of the present invention comprises the following steps.

[0124] Initialization Step (as Shown in Block 50 of the Figure):

[0125] Use the training sequence to determine the initial matrix H; determine the received signals x(k) and set x₁(k)=x(k) for the first iteration.

[0126] Step 1 (as shown in Block 51 of the Figure):

[0127] Compute R=R_(M) and Q=Q_(M) recursively (see Equations (26) and (28), above).

[0128] Step 2 (as Shown in Block 52 of the Figure):

[0129] The element of y(k) with the highest SNR is detected. For the MMSE filter, this element is associated with the smallest diagonal entry of Q. If the index of that entry is p_(m), then the estimate of the chosen transmitted signal is given by:

ŝ _(p) _(m) (k)=Q[y _(p) _(m) (k)].

[0130] Step 3 (as Shown in Block 53 of the Figure):

[0131] Assuming that ŝ_(p) _(m) (k)=s_(p) _(m) (k), then s_(p) _(m) (k) is cancelled from the received vector x_(m)(k), resulting in a modified received vector, x_(m+1)(k).

[0132] Step 4 (as Shown in Block 54 of the Figure):

[0133] Assuming that l_(m) is the index of the chosen transmitted signal, the rows and columns l_(m) and M−m+1 (the last) are permuted in both R_(M+1−m) and Q_(M+1−m).

[0134] Step 5 (as Shown in Block 55 of the Figure):

[0135] Partition matrix R_(M+1−m) to determine R_(M−m), v_(M−m), and β_(p) _(m) and remove the last row and column of Q_(M+1−m) to determine T_(M−m) ⁻¹.

[0136] Step 6 (as Shown in Block 56 of the Figure):

[0137] Compute Q_(M−m) (recursively) using Equation (39).

[0138] Step 7 (as Shown in Block 57 of the Figure):

[0139] Unless all M transmitted signals have already been decoded, repeat steps 2-6 for components p₂, …, p_(M) by operating in turn on the progression of modified received vectors x₂(k), …, x_(M)(k).

[0140] The following more formally summarizes the illustrative MIMO decoding technique in accordance with the illustrative embodiment of the present invention as presented herein:

[0141] Initialization and First Iteration:

[0142] x₁(k)=x(k)

[0143] Compute R=R_(M) and Q=Q_(M) recursively (see Equations (26) and (28))

[0144] f(k)=[1 2 … M]^(T)

[0145] l₁=arg min q_(M,ii), p₁=f_(l) ₁ (k) ${y_{p_{1}}(k)} = {\sum\limits_{i = 1}^{M}{q_{M,{l_{1}i}}h_{:i}^{H}{x_{1}(k)}}}$

[0146] Interchange the entries l₁ and M of the vector f(k)

[0147] ŝ_(p) ₁ (k)=Q[y_(p) ₁ (k)]

[0148] Recursion (Subsequent Iterations), for m=1, 2, …, M−1:

[0149] (a) x_(m+1)(k)=x_(m)(k)−ŝ_(p) _(m) (k)h_(:p) _(m)

[0150] (b) Permute the rows and columns l_(m) and M−m+1 of R_(M+1−m)

[0151] (c) Permute the rows and columns l_(m) and M−m+1 of Q_(M+1−m)

[0152] (d) Determine R_(M−m), v_(M−m), and β_(p) _(m) from R_(M+1−m) as shown above

[0153] (e) Determine T_(M−m)⁻¹ from Q_(M+1−m) by removing its last row and column

[0155] (f) Compute Q_(M−m) recursively as shown above—see Equation (39)

[0156] (g) l_(m+1)=arg min q_(M−m,ii), p_(m+1)=f_(l) _(m+1) (k)

(h) ${y_{p_{m + 1}}(k)} = {\sum\limits_{i = 1}^{M - m}{q_{{M - m},{l_{m + 1}i}}h_{:{f_{i}{(k)}}}^{H}{x_{m + 1}(k)}}}$

[0157] (i) Interchange the entries l_(m+1) and M−m of the vector f(k)

[0158] (j) ŝ_(p) _(m+1) (k)=Q [y_(p) _(m+1) (k)]

[0159] Solutions:

[0160] The estimates of the transmitted signals: [ŝ_(p) ₁ (k) ŝ_(p) ₂ (k) … ŝ_(p) _(M) (k)]^(T)

[0161] The decoding order: f(k)=[p_(M) p_(M−1) … p₁]^(T)
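The summarized procedure may be sketched end to end as follows. This is an illustrative sketch only, under several assumptions not found in the description: the quantizer Q[·] is taken to be a unit-energy QPSK slicer, the names `qpsk_slice`, `swap_in_place` and `detect` are hypothetical, indices are zero-based, and R and Q are formed directly at the start for brevity rather than by the recursions of Equations (26) and (28).

```python
import numpy as np

def qpsk_slice(y):
    """Assumed quantizer Q[.]: nearest unit-energy QPSK point (illustrative only)."""
    return (np.sign(y.real) + 1j * np.sign(y.imag)) / np.sqrt(2)

def swap_in_place(A, i, j):
    """Swap rows i and j, then columns i and j, of A."""
    A[[i, j], :] = A[[j, i], :]
    A[:, [i, j]] = A[:, [j, i]]

def detect(H, x, alpha, slicer=qpsk_slice):
    """Ordered detection with cancellation and recursive deflation of Q."""
    N, M = H.shape
    R = H.conj().T @ H + alpha * np.eye(M)   # formed directly here for brevity
    Q = np.linalg.inv(R)
    f = list(range(M))                       # f(k): position -> original index
    x = x.astype(complex)                    # x_1(k), working copy
    s_hat = np.zeros(M, dtype=complex)
    order = []
    for m in range(M):
        size = M - m                         # signals not yet detected
        l = int(np.argmin(np.real(np.diag(Q))))       # step (g)
        H_cur = H[:, f[:size]]                        # columns in current order
        y = Q[l, :] @ (H_cur.conj().T @ x)            # step (h)
        p = f[l]
        s_hat[p] = slicer(y)                          # step (j)
        order.append(p)
        if size == 1:
            break
        x = x - s_hat[p] * H[:, p]                    # step (a): cancellation
        swap_in_place(R, l, size - 1)                 # step (b)
        swap_in_place(Q, l, size - 1)                 # step (c)
        f[l], f[size - 1] = f[size - 1], f[l]         # step (i)
        v, beta = R[:-1, -1:], R[-1, -1]              # step (d)
        T_inv = Q[:-1, :-1]                           # step (e)
        Tv = T_inv @ v
        Q = T_inv - (Tv @ (v.conj().T @ T_inv)) / (beta + (v.conj().T @ Tv).item())  # (f)
        R = R[:-1, :-1]
    return s_hat, order
```

The returned `s_hat` is indexed by original transmit antenna, and `order` lists the detected indices in decoding order.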

[0162] Addendum to the Detailed Description

[0163] It should be noted that all of the preceding discussion merely illustrates the general principles of the invention. It will be appreciated that those skilled in the art will be able to devise various other arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. It is also intended that such equivalents include both currently known equivalents as well as equivalents developed in the future—i.e., any elements developed that perform the same function, regardless of structure.

[0164] Thus, for example, it will be appreciated by those skilled in the art that the block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown. Thus, the blocks shown, for example, in such flowcharts may be understood as potentially representing physical elements, which may, for example, be expressed in the instant claims as means for specifying particular functions such as are described in the flowchart blocks. Moreover, such flowchart blocks may also be understood as representing physical signals or stored physical data, which may, for example, be comprised in such aforementioned computer readable medium such as disc or semiconductor storage devices.

[0165] The functions of the various elements shown in the figures, including functional blocks labeled as “processors” or “modules” may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context. 

We claim:
 1. A method for detecting a plurality, M, of transmitted signals transmitted across a channel by respective transmitting antenna elements in a multiple-input multiple-output communications system, the method comprising the steps of: (a) collecting a plurality of received signals from respective receiving antenna elements in said communications system; (b) determining a channel matrix H of estimated channel coefficients based on said plurality of received signals; (c) computing an estimate of a selected one of said transmitted signals, said estimate based on said plurality of received signals and on an intermediate matrix Q, thereby resulting in detection of the selected one of said transmitted signals, wherein said intermediate matrix Q is a function of the channel matrix H; and (d) repeating at least step (c) one or more times to detect an additional one or more of said transmitted signals, wherein said intermediate matrix Q as used in step (c) for each such repeated execution thereof is re-computed based on a function of an inverse of a Schur complement of an element in the inverse of a modified version of the intermediate matrix Q used in the previous execution of step (c).
 2. The method of claim 1 further comprising the step of: modifying one or more of said received signals by at least partially canceling an effect of the detected signal from said received signals based on the computed estimate of said detected signal, and wherein said modified received signals are used in a subsequent repetition of step (c).
 3. The method of claim 1 wherein said selected one of said transmitted signals detected in each execution of step (c) is selected in accordance with a preferred order, and wherein said preferred order is based on signal-to-noise ratios of said transmitted signals as determined at each execution of step (c).
 4. The method of claim 1 wherein said channel matrix H is initially determined based on a transmission of a predetermined training sequence.
 5. The method of claim 1 wherein the intermediate matrix Q as used in step (c) in a first execution thereof is Q_(M)=[H_(M) ^(H)H_(M)+αI_(M×M)]⁻¹, where H_(M) is the determined channel matrix H, the operator ^(H) represents a conjugate transpose of a matrix, I_(M×M) represents an M×M identity matrix, and α is a predetermined constant based on a signal-to-noise ratio of the transmitted signals.
 6. The method of claim 1 wherein the modified version of the intermediate matrix Q is derived from the intermediate matrix Q by permuting (i) a matrix row and a matrix column which corresponds to said transmitted signal detected by said previous execution of step (c) with (ii) a matrix row and a matrix column which corresponds to a last one of said transmitted signals which has not yet been detected, respectively.
 7. The method of claim 6 wherein the intermediate matrix Q as used in step (c) in each repeated execution thereof is Q=Q_(m−1) where ${Q_{m - 1} = {T_{m - 1}^{- 1} - \frac{T_{m - 1}^{- 1}v_{m - 1}v_{m - 1}^{H}T_{m - 1}^{- 1}}{\beta_{p_{M - m + 1}} + {v_{m - 1}^{H}T_{m - 1}^{- 1}v_{m - 1}}}}},$

where T_(m − 1) = R_(m − 1) − v_(m − 1)v_(m − 1)^(H)/β_(p_(M − m + 1))

is the Schur complement of β_(p) _(M−m+1) in Q_(m) ⁻¹, where the operator ^(H) represents a conjugate transpose of a matrix, and where the modified version of the intermediate matrix Q as used in step (c) in each corresponding execution immediately preceding said repeated execution thereof, is Q_(m) where $Q_{m} = {\begin{bmatrix} R_{m - 1} & v_{m - 1} \\ v_{m - 1}^{H} & \beta_{p_{M - m + 1}} \end{bmatrix}^{- 1}.}$


8. An apparatus for detecting a plurality, M, of transmitted signals transmitted across a channel by respective transmitting antenna elements in a multiple-input multiple-output communications system, the apparatus comprising: a plurality of receiving antenna elements in said communications system for collecting a corresponding plurality of received signals; and a processor adapted to perform the steps of: (b) determining a channel matrix H of estimated channel coefficients based on said plurality of received signals; (c) computing an estimate of a selected one of said transmitted signals, said estimate based on said plurality of received signals and on an intermediate matrix Q, thereby resulting in detection of the selected one of said transmitted signals, wherein said intermediate matrix Q is a function of the channel matrix H; and (d) repeating at least step (c) one or more times to detect an additional one or more of said transmitted signals, wherein said intermediate matrix Q as used in step (c) for each such repeated execution thereof is re-computed based on a function of an inverse of a Schur complement of an element in the inverse of a modified version of the intermediate matrix Q used in the previous execution of step (c).
 9. The apparatus of claim 8, wherein said processor is further adapted to perform the additional step of: modifying one or more of said received signals by at least partially cancelling an effect of the detected signal from said received signals based on the computed estimate of said detected signal, and wherein said modified received signals are used in a subsequent repetition of step (c).
 10. The apparatus of claim 8 wherein said selected one of said transmitted signals detected in each execution of step (c) is selected in accordance with a preferred order, and wherein said preferred order is based on signal-to-noise ratios of said transmitted signals as determined at each execution of step (c).
 11. The apparatus of claim 8 wherein said channel matrix H is initially determined based on a transmission of a predetermined training sequence.
 12. The apparatus of claim 8 wherein the intermediate matrix Q as used in step (c) in a first execution thereof is Q_(M)=[H_(M) ^(H)H_(M)+αI_(M×M)]⁻¹, where H_(M) is the determined channel matrix H, the operator ^(H) represents a conjugate transpose of a matrix, I_(M×M) represents an M×M identity matrix, and α is a predetermined constant based on a signal-to-noise ratio of the transmitted signals.
 13. The apparatus of claim 8 wherein the modified version of the intermediate matrix Q is derived from the intermediate matrix Q by permuting (i) a matrix row and a matrix column which corresponds to said transmitted signal detected by said previous execution of step (c) with (ii) a matrix row and a matrix column which corresponds to a last one of said transmitted signals which has not yet been detected, respectively.
 14. The apparatus of claim 8 wherein the intermediate matrix Q as used in step (c) in each repeated execution thereof is Q=Q_(m−1) where ${Q_{m - 1} = {T_{m - 1}^{- 1} - \frac{T_{m - 1}^{- 1}v_{m - 1}v_{m - 1}^{H}T_{m - 1}^{- 1}}{\beta_{p_{M - m + 1}} + {v_{m - 1}^{H}T_{m - 1}^{- 1}v_{m - 1}}}}},$

where T_(m − 1) = R_(m − 1) − v_(m − 1)v_(m − 1)^(H)/β_(p_(M − m + 1))

is the Schur complement of β_(p) _(M−m+1) in Q_(m) ⁻¹, where the operator ^(H) represents a conjugate transpose of a matrix, and where the modified version of the intermediate matrix Q as used in step (c) in each corresponding execution immediately preceding said repeated execution thereof, is Q_(m) where $Q_{m} = {\begin{bmatrix} R_{m - 1} & v_{m - 1} \\ v_{m - 1}^{H} & \beta_{p_{M - m + 1}} \end{bmatrix}^{- 1}.}$